WePS-3 Evaluation Campaign: Overview of the Web People Search Clustering and Attribute Extraction Tasks
Authors
Abstract
The third WePS (Web People Search) Evaluation campaign took place in 2009-2010 and attracted the participation of 13 research groups from Europe, Asia and North America. Given the top web search results for a person name, two tasks were addressed: a clustering task, which consists of grouping together web pages referring to the same person, and an extraction task, which consists of extracting salient attributes for each of the persons sharing the same name. Continuing the path of previous campaigns, this third evaluation aimed at merging both problems into one single task, where the system must return both the documents and the attributes for each of the different people sharing a given name. This is not a trivial step from the point of view of evaluation: a system may correctly extract attribute profiles from different URLs but then incorrectly merge profiles. This campaign also featured a larger testbed and the participation of a state-of-the-art commercial WePS system in the attribute extraction task. This paper presents the definition, resources, evaluation methodology and results for the clustering and attribute extraction tasks.
Similar resources
WePS 2 Evaluation Campaign: Overview of the Web People Search Clustering Task
The second WePS (Web People Search) Evaluation campaign took place in 2008-2009 with the participation of 19 research groups from Europe, Asia and North America. Given the output of a Web Search Engine for a (usually ambiguous) person name as query, two tasks were addressed: a clustering task, which consists of grouping together web pages referring to the same person, and an extraction task, wh...
CASIANED: People Attribute Extraction based on Information Extraction
In this paper, we describe the people attribute extraction system of the CASIANED team for the second Web People search evaluation (WePS-2). We develop an attribute extraction system based on information extraction. Firstly the attribute candidates for every attribute class are extracted using several different information extraction techniques; then these candidates are verified through classi...
TALP at WePS-3 2010
In this paper we present our system and experiments at the Third Web People Search Workshop (WePS-3) task for clustering web people search documents in English. In our experiments we used a simple approach with three algorithms: Lingo, Hierarchical Agglomerative Clustering (HAC), and a 2-step HAC algorithm. We also present the results and initial conclusions in the context of the WePS-3 Task 1 f...
Combining Evaluation Metrics with a Unanimous Improvement Ratio and its Application to the Web People Search Clustering Task
This paper presents the Unanimous Improvement Ratio (UIR), a measure that allows systems to be compared using two evaluation metrics without dependencies on relative metric weights. For clustering tasks, this kind of measure becomes necessary given the trade-off between precision- and recall-oriented metrics (e.g. Purity and Inverse Purity) which usually depends on a clustering threshold parameter s...
Which Who are They? People Attribute Extraction and Disambiguation in Web Search Results
People name search often returns a lot of Web pages containing the strings of personal names. Due to namesake, extracting target person attributes (such as birthday, occupation, affiliation, nationality, contact information, etc.) is expected to be helpful to differentiate documents related to different people and thus group documents related to the same person. This paper presents the methodol...
Publication date: 2010